Revision  2 


Evaluation  of  a  Particle  Swarm  Algorithm 
for  Biomechanical  Optimization 

Jaco  F.  Schutte1,  Byung-Il  Koh2,  Jeffrey  A.  Reinbolt1, 

13  1  2 

Benjamin  J.  Fregly  ’  ,  Raphael  T.  Haftka  ,  and  Alan  D.  George 

Department  of  Mechanical  &  Aerospace  Engineering,  Department  of  Electrical  &  Computer 
Engineering,  and  Department  of  Biomedical  Engineering,  University  of  Florida,  Gainesville,  FL 


Re-submitted  as  an  original  full-length  research  paper  to  the 
Journal  of  Biomechanical  Engineering 

September  23,  2004 


Address  correspondence  to: 

B.J.  Fregly,  Ph.D. 

Assistant  Professor 

Department  of  Mechanical  &  Aerospace  Engineering 
231  MAE-A  Building 
PO  Box  116250 
University  of  Florida 
Gainesville,  FL  32611-6250 
Phone:  (352)392-8157 
Fax:  (352)  392-7303 
E-mail:  fregly@ufl.edu 


Key  words:  Biomechanical  optimization,  scaling,  particle  swarm,  system  identification. 


Report  Documentation  Page 

Form  Approved 

OMB  No.  0704-0188 

Public  reporting  burden  for  the  collection  of  information  is  estimated  to  average  1  hour  per  response,  including  the  time  for  reviewing  instructions,  searching  existing  data  sources,  gathering  and 
maintaining  the  data  needed,  and  completing  and  reviewing  the  collection  of  information.  Send  comments  regarding  this  burden  estimate  or  any  other  aspect  of  this  collection  of  information, 
including  suggestions  for  reducing  this  burden,  to  Washington  Headquarters  Services,  Directorate  for  Information  Operations  and  Reports,  1215  Jefferson  Davis  Highway,  Suite  1204,  Arlington 

VA  22202-4302.  Respondents  should  be  aware  that  notwithstanding  any  other  provision  of  law,  no  person  shall  be  subject  to  a  penalty  for  failing  to  comply  with  a  collection  of  information  if  it 
does  not  display  a  currently  valid  OMB  control  number. 

1.  REPORT  DATE 

23  SEP  2004  2- REPORT  TYPE 

3.  DATES  COVERED 

00-00-2004  to  00-00-2004 

4.  TITLE  AND  SUBTITLE 

Evaluation  of  a  Particle  Swarm  Algorithm  for  Biomechanical 

Optimization 

5a.  CONTRACT  NUMBER 

5b.  GRANT  NUMBER 

5c.  PROGRAM  ELEMENT  NUMBER 

6.  AUTHOR(S) 

5d.  PROJECT  NUMBER 

5e.  TASK  NUMBER 

5f.  WORK  UNIT  NUMBER 

7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

University  of  Florida, Department  of  Mechanical  &  Aerospace 

Engineering, Gainesville, FL, 32611 

8.  PERFORMING  ORGANIZATION 

REPORT  NUMBER 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

10.  SPONSOR/MONITOR'S  ACRONYM(S) 

11.  SPONSOR/MONITOR'S  REPORT 
NUMBER(S) 

12.  DISTRIBUTION/AVAILABILITY  STATEMENT 

Approved  for  public  release;  distribution  unlimited 

13.  SUPPLEMENTARY  NOTES 

14.  ABSTRACT 

15.  SUBJECT  TERMS 

16.  SECURITY  CLASSIFICATION  OF:  17.  LIMITATION  OF 

18.  NUMBER  19a.  NAME  OF 

a.  REPORT  b.  ABSTRACT  c.  THIS  PAGE 

unclassified  unclassified  unclassified 

40 

Standard  Form  298  (Rev.  8-98) 

Prescribed  by  ANSI  Std  Z39-18 


Revision  2 


Schutte  et  al. 


Abstract 

Optimization  is  frequently  employed  in  biomechanics  research  to  solve  system  identification 
problems,  predict  human  movement,  or  estimate  muscle  or  other  internal  forces  that  cannot  be 
measured  directly.  Unfortunately,  biomechanical  optimization  problems  often  possess  multiple 
local  minima,  making  it  difficult  to  find  the  best  solution.  Furthermore,  convergence  in  gradient- 
based  algorithms  can  be  affected  by  scaling  to  account  for  design  variables  with  different  length 
scales  or  units.  This  study  evaluates  a  recently  developed  version  of  the  particle  swann 
optimization  (PSO)  algorithm  to  address  these  problems.  The  algorithm’s  global  search 
capabilities  were  investigated  using  a  suite  of  difficult  analytical  test  problems,  while  its  scale- 
independent  nature  was  proven  mathematically  and  verified  using  a  biomechanical  test  problem. 
For  comparison,  all  test  problems  were  also  solved  with  three  off-the-shelf  optimization 
algorithms  -  a  global  genetic  algorithm  (GA)  and  multi-start  gradient-based  sequential  quadratic 
programming  (SQP)  and  quasi-Newton  (BFGS)  algorithms.  For  the  analytical  test  problems, 
only  the  PSO  algorithm  was  successful  on  the  majority  of  the  problems.  When  compared  to 
previously  published  results  for  the  same  problems,  PSO  was  more  robust  than  a  global 
simulated  annealing  algorithm  but  less  robust  than  a  different,  more  complex  genetic  algorithm. 
For  the  biomechanical  test  problem,  only  the  PSO  algorithm  was  insensitive  to  design  variable 
scaling,  with  the  GA  algorithm  being  mildly  sensitive  and  the  SQP  and  BFGS  algorithms  being 
highly  sensitive.  The  proposed  PSO  algorithm  (freely  available  with  this  article)  provides  a  new 
off-the-shelf  global  optimization  option  for  difficult  biomechanical  problems,  especially  those 
utilizing  design  variables  with  different  length  scales  or  units. 
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1.  Introduction 

Optimization  methods  are  used  extensively  in  biomechanics  research  to  predict  movement- 
related  quantities  that  cannot  be  measured  experimentally.  Forward  dynamic,  inverse  dynamic, 
and  inverse  static  optimizations  have  been  used  to  predict  muscle,  ligament,  and  joint  contact 
forces  during  experimental  or  predicted  movements  [e.g.,  1-12].  System  identification 
optimizations  have  been  employed  to  tune  a  variety  of  musculoskeletal  model  parameters  to 
experimental  movement  data  [e.g.,  13-17].  Image  matching  optimizations  have  been  performed 
to  align  implant  and  bone  models  to  in  vivo  fluoroscopic  images  collected  during  loaded 
functional  activities  [e.g.,  18-20]. 

Since  biomechanical  optimization  problems  are  typically  nonlinear  in  the  design  variables, 
gradient-based  nonlinear  programming  has  been  the  most  widely  used  optimization  method.  The 
increasing  size  and  complexity  of  biomechanical  models  has  also  led  to  parallelization  of 
gradient-based  algorithms,  since  gradient  calculations  can  be  easily  distributed  to  multiple 
processors  [1-3].  However,  gradient-based  optimizers  can  suffer  from  several  important 
limitations.  They  are  local  rather  than  global  by  nature  and  so  can  be  sensitive  to  the  initial 
guess.  Experimental  or  numerical  noise  can  exacerbate  this  problem  by  introducing  multiple 
local  minima  into  the  problem.  For  some  problems,  multiple  local  minima  may  exist  due  to  the 
nature  of  the  problem  itself.  In  most  situations,  the  necessary  gradient  values  cannot  be  obtained 
analytically,  and  finite  difference  gradient  calculations  can  be  sensitive  to  the  selected  finite 
difference  step  size.  Furthermore,  the  use  of  design  variables  with  different  length  scales  or  units 
can  produce  poorly  scaled  problems  that  converge  slowly  or  not  at  all  [21-22],  necessitating 
design  variable  scaling  to  improve  performance. 

Motivated  by  these  limitations  and  improvements  in  computer  speed,  recent  studies  have 
begun  investigating  the  use  of  non-gradient  global  optimizers  for  biomechanical  applications. 
Neptune  [4]  compared  the  performance  of  a  simulated  annealing  (SA)  algorithm  with  that  of 
downhill  simplex  (DS)  and  sequential  quadratic  programming  (SQP)  algorithms  on  a  forward 
dynamic  optimization  of  bicycle  pedaling  utilizing  27  design  variables.  Simulated  annealing 
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found  a  better  optimum  than  the  other  two  methods  and  in  a  reasonable  amount  of  CPU  time. 
More  recently,  Soest  and  Casius  [5]  evaluated  a  parallel  implementation  of  a  genetic  algorithm 
(GA)  using  a  suite  of  analytical  tests  problems  with  up  to  32  design  variables  and  forward 
dynamic  optimizations  of  jumping  and  isokinetic  cycling  with  up  to  34  design  variables.  The 
genetic  algorithm  generally  outperformed  all  other  algorithms  tested,  including  SA,  on  both  the 
analytical  test  suite  and  the  movement  optimizations. 

This  study  evaluates  a  recent  addition  to  the  arsenal  of  global  optimization  methods  -  particle 
swarm  optimization  (PSO)  -  for  use  on  biomechanical  problems.  A  recently-developed  variant 
of  the  PSO  algorithm  is  used  for  the  investigation.  The  algorithm’s  global  search  capabilities  are 
evaluated  using  a  previously  published  suite  of  difficult  analytical  test  problems  with  multiple 
local  minima  [5],  while  its  insensitivity  to  design  variable  scaling  is  proven  mathematically  and 
verified  using  a  biomechanical  test  problem.  For  both  categories  of  problems,  PSO  robustness, 
performance,  and  scale-independence  are  compared  to  that  of  three  off-the-shelf  optimization 
algorithms  -  a  genetic  algorithm  (GA),  a  sequential  quadratic  programming  algorithm  (SQP),  and 
a  Broydon-Fletcher-Goldfarb-Shanno  (BFGS)  quasi-Newton  algorithm.  In  addition,  previously 
published  results  [5]  for  the  analytical  test  problems  permit  comparison  with  a  more  complex  GA 
algorithm  (GA*),  a  simulated  annealing  algorithm  (SA),  a  different  SQP  algorithm  (SQP*),  and 
a  downhill  simplex  (DS)  algorithm. 

2,  Theory 

2.1  Particle  swarm  algorithm.  Particle  swarm  optimization  is  a  stochastic  global 
optimization  approach  introduced  by  Kennedy  and  Eberhart  [23].  The  method's  strength  lies  in 
its  simplicity,  being  easy  to  code  and  requiring  few  algorithm  parameters  to  define  convergence 
behavior.  The  following  is  a  brief  introduction  to  the  operation  of  the  particle  swarm  algorithm 
based  on  a  recent  implementation  by  Fourie  and  Groenwold  [24]  incorporating  dynamic  inertia 
and  velocity  reduction. 
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Consider  a  swarm  of  p  particles,  where  each  particle's  position  x).  represents  a  possible 
solution  point  in  the  problem  design  space  D.  For  each  particle  i,  Kennedy  and  Eberhart  [23] 
proposed  that  the  position  xlk+l  be  updated  in  the  following  manner: 


-*T+1  —  Xk  Vk+ 1  > 


(1) 


with  a  pseudo-velocity  vlk+]  calculated  as  follows: 

v[+i  =  wkv‘k  +  cxrx  {p[  -  x[ ) 4-  c2r,  (gk  -  x[ )  (2) 

Here,  subscript  k  indicates  a  (unit)  pseudo-time  increment.  The  point  p[  is  the  best-found  cost 
location  by  particle  i  up  to  timestep  k,  which  represents  the  cognitive  contribution  to  the  search 
vector  v'k+x .  Each  component  of  vlk+]  is  constrained  to  be  less  than  or  equal  to  a  maximum  value 
defined  in  v™x .  The  point  gk  is  the  global  best-found  position  among  all  particles  in  the  swarm 
up  to  time  k  and  forms  the  social  contribution  to  the  velocity  vector.  Cost  function  values 
associated  with  p'k  and  gk  are  denoted  by  flbest  and  fb8est  respectively.  Random  numbers  r,  and 
r2  are  uniformly  distributed  in  the  interval  [0,1].  Shi  and  Eberhart  [25]  proposed  that  the 
cognitive  and  social  scaling  parameters  cx  and  c2  be  selected  such  that  cx  =  c2  =  2  to  allow  the 
product  cxrx  or  c2r2  to  have  a  mean  of  1.  The  result  of  using  these  proposed  values  is  that  the 
particles  overshoot  the  attraction  points  p'k  and  gk  half  the  time,  thereby  maintaining  separation 
in  the  group  and  allowing  a  greater  area  to  be  searched  than  if  the  particles  did  not  overshoot. 
The  variable  wk ,  set  to  1  at  initialization,  is  a  modification  to  the  original  PSO  algorithm  [23]. 
By  reducing  its  value  dynamically  based  on  the  cost  function  improvement  rate,  the  search  area 
is  gradually  reduced  [26],  This  dynamic  reduction  behavior  is  defined  by  wd,  the  amount  by 
which  the  inertia  wk  is  reduced,  vd ,  the  amount  by  which  the  maximum  velocity  v™)  is 
reduced,  and  d,  the  number  of  iterations  with  no  improvement  in  gk  before  these  reduction  take 
place  [24]  (see  algorithm  flow  description  below). 

Initialization  of  the  algorithm  involves  several  important  steps.  Particles  are  randomly 
distributed  throughout  the  design  space,  and  particle  velocities  Vg  are  initialized  to  random 
values  within  the  limits  0  <v' <v0max-  The  particle  velocity  upper  limit  v”ax  is  calculated  as  a 
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fraction  of  the  distance  between  the  upper  and  lower  bound  on  variables  in  the  design  space 
v“ax  =  k{xub  ~xlb)  with  k  =  0.5  as  suggested  in  [26].  Iteration  counters  k  and  t  are  set  to  0. 
Iteration  counter  k  is  used  to  monitor  the  total  number  of  swarm  iterations,  while  iteration 
counter  t  is  used  to  monitor  the  number  of  swarm  iterations  since  the  last  improvement  in  gk . 
Thus,  t  is  periodically  reset  to  zero  during  the  optimization  while  k  is  not. 

The  algorithm  flow  can  be  represented  as  follows: 

1 .  Initialize 

(a)  Set  constants  k  ,  cx,  c2 ,  kmax ,  v”ax ,  w0,  vd,  wd,  and  d 

(b)  Set  counters  k  =  0,  t  =  0.  Set  random  number  seed. 

(c)  Randomly  initialize  particle  positions  x'0  e  D  in  for  i  =  1, . . . ,  p 

(d)  Randomly  initialize  particle  velocities  0  <  v‘0  <  v“ax  for  i  =  1, . . . ,  p 

(e)  Evaluate  cost  function  values  /0'  using  design  space  coordinates  x'0  for  /  =  1, . . . ,  p 

(f)  Set  fLt  =  /o'  and  Po  =  xo  for  i  =  h--.,P 

(g)  Set  fbsest  to  best  f‘est  and  g0  to  corresponding  x’ 

2.  Optimize 

(a)  Update  particle  velocity  vectors  v[+1  using  Eq.  (2) 

(b)  If  v[+1  >  v“ax  for  any  component,  then  set  that  component  to  its  maximum  allowable  value 

(c)  Update  particle  position  vectors  xlk+l  using  Eq.  (1) 

(d)  Evaluate  cost  function  values  flM  using  design  space  coordinates  x[  M  for  i  =  1, . . . ,  p 

(e)  ^  fL  ^  fLt » then  fL  =  fL »  PL  =  XL  for  i  =  1, . . . ,  p 

(f)  If  fL i  ^  fL » then  .//,  =  fL ,  gk+ 1  =  -<  ,i  for  /  =  1, . . . ,  p 

(g)  If  fhgest  was  improved  in  (e),  then  reset  t  =  0.  Else  increment  t 

(h)  If  maximum  number  of  function  evaluations  is  exceeded,  then  go  to  3 

(i)  If  t  =  d,  then  multiply  wk+x  by  (l  -  wd )  and  v“ax  by  (l  -  vd ) 

(j)  Increment  k 

(k)  Go  to  2(a). 
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3.  Report  results 

4.  Terminate 

This  algorithm  was  coded  in  the  C  programming  language  by  the  lead  author  [27]  and  used 
for  all  PSO  analyses  performed  in  the  present  study.  A  standard  population  size  of  20  particles 
was  used  for  all  runs,  and  other  algorithm  parameters  were  also  selected  based  on  standard 
recommendations  (Table  1)  [27-29]. 

2.2  Analysis  of  Scale  Sensitivity.  One  of  the  benefits  of  the  PSO  algorithm  is  its 
insensitivity  to  design  variable  scaling.  To  prove  this  characteristic,  we  will  use  a  proof  by 
induction  to  show  that  all  particles  follow  an  identical  path  through  the  design  space  regardless 
of  how  the  Design  variables  are  scaled.  In  actual  PSO  runs  intended  to  investigate  this  property, 
use  of  the  same  random  seed  in  scaled  and  unsealed  cases  will  ensure  that  an  identical  sequence 
of  random  r,  and  r2  values  are  produced  by  the  computer  throughout  the  course  of  the 
optimization. 

Consider  an  optimization  problem  with  n  design  variables.  An  //-dimensional  constant 
scaling  vector  C  can  be  used  to  scale  any  or  all  dimensions  of  the  problem  design  space: 

vr 

C  =  (3) 

Cn_ 

We  wish  to  show  that  for  any  time  step  k  >  0 , 

K=Cvk  (4) 

x'k=Cxk  (5) 

where  xk  and  vk  (dropping  superscript  /')  are  the  unsealed  position  and  velocity,  respectively,  of 
an  individual  particle  and  x[  =  Cxk  and  v[  =  Cvk  are  the  corresponding  scaled  versions. 
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First,  we  must  show  that  our  proposition  is  true  for  the  base  case,  which  involves 
initialization  (k  =  0)  and  the  first  time  step  (A'  =  l).  Applying  the  scaling  vector  C  to  an 
individual  particle  position  jt0  during  initialization  produces  a  scaled  particle  position  jc' : 

Xg=CX0  (6) 


This  implies  that 

Po  ~  £ P(n  So  -  Cgo 

In  the  unsealed  case,  the  pseudo-velocity  is  calculated  as 

v0  =k(xub-xlb) 

In  the  scaled  case,  this  becomes 

vo  =  x(xUB  —  xLB ) 

=  k{^xub  —  C  xLB ) 

—  C\x{.XuB  _  X LB  )] 

=  Cv0 


(7) 


(8) 


(9) 


From  Eqs.  (1)  and  (2)  and  these  initial  conditions,  the  particle  pseudo-velocity  and  position  for 
the  first  time  step  can  be  written  as 

Vi  =%v0  +c1r1(p0  -x0)+c2r2(g0  -  x0)  (10) 


A  =  x0  +  Vj 


(ID 


in  the  unsealed  case  and 

v[  =w0v'0  +  cy,  (p'0  -x'0)+  c2r2  (g'0  -  x'0 ) 

=  w0Cv0+c1r1(Cp0-Cx0)+c2r2((g0-Cx0) 
=  C[w0v0  +  cxrx (p0  -x0)  +  c2r2 (g0  -  x0 )] 

=  Cvx 

x[  =x'Q+  v[ 

=  CxQ+Cvx 
=  C[x0  +  V,  ] 

=  Cx  1 


in  the  scaled  case.  Thus,  our  proposition  is  true  for  the  base  case. 


(12) 


(13) 
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Next,  we  must  show  that  our  proposition  is  true  for  the  inductive  step.  If  we  assume  our 
proposition  holds  for  any  time  step  k  =  j ,  we  must  prove  that  it  also  holds  for  time  step 
k  =  j  + 1 .  We  begin  by  replacing  subscript  k  with  subscript  j  in  Eqs.  (4)  and  (5).  If  we  then 
replace  subscript  0  with  subscript  j  and  subscript  1  with  subscript  j  + 1  in  Eqs.  (12)  and  (13),  we 
arrive  at  Eqs.  (4)  and  (5)  where  subscript  k  is  replaced  by  subscript  j  +  1 .  Thus,  our  proposition 
is  true  for  any  time  step  j  + 1 . 

Consequently,  since  the  base  case  is  true  and  the  inductive  step  is  true,  Eqs.  (4)  and  (5)  are 
true  for  all  k>  0 .  From  Eqs.  (4)  and  (5),  we  can  conclude  that  any  linear  scaling  of  the  design 
variables  (or  subset  thereof)  will  have  no  effect  on  the  final  or  any  intennediate  result  of  the 
optimization,  since  all  velocities  and  positions  are  scaled  accordingly.  This  fact  leads  to  identical 
step  intervals  being  taken  in  the  design  space  for  scaled  and  unsealed  version  of  the  same 
problem,  assuming  infinite  precision  in  all  calculations. 

In  contrast,  gradient-based  optimization  methods  are  often  sensitive  to  design  variable 
scaling  due  to  algorithmic  issues  and  numerical  approximations.  First  derivative  methods  are 
sensitive  because  of  algorithmic  issues,  as  illustrated  by  a  simple  example.  Consider  the 


following  minimization  problem  with  two  design  variables  ( x ,  y)  where  the  cost  function  is 


2  V 
x  +  — 


with  initial  guess  (1,1).  A  scaled  version  of  the  same  problem  can  be  created  by  letting 
x  =  x,y  =  y  / 10  so  that  the  cost  function  becomes 

x2+y2  (15) 


with  initial  guess  (1,10).  Taking  first  derivatives  of  each  cost  function  with  respect  to  the 
corresponding  design  variables  and  evaluating  at  the  initial  guesses,  the  search  direction  for  the 
unsealed  problem  is  along  a  line  rotated  5.7°  from  the  positive  x  axis  and  for  the  scaled  problem 
along  a  line  rotated  45°.  To  reach  the  optimum  in  a  single  step,  the  unsealed  problem  requires  a 
search  direction  rotated  84.3°  and  the  scaled  problem  45°.  Thus,  the  scaled  problem  can 
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theoretically  reach  the  optimum  in  a  single  step  while  the  unsealed  problem  cannot  due  to  the 
effect  of  scaling  on  the  calculated  search  direction. 

Second  derivative  methods  are  sensitive  to  design  variable  scaling  because  of  numerical 
issues  related  to  approximation  of  the  Hessian  (second  derivative)  matrix.  According  to  Gill  et 
al.  [21],  Newton  methods  utilizing  an  exact  Hessian  matrix  will  be  insensitive  to  design  variable 
scaling  as  long  as  the  Hessian  matrix  remains  positive  definite.  However,  in  practice,  exact 
Hessian  calculations  are  almost  never  available,  necessitating  numerical  approximations  via 
finite  differencing.  Errors  in  these  approximations  result  in  different  search  directions  for  scaled 
versus  unsealed  versions  of  the  same  problem.  Even  a  small  amount  of  design  variable  scaling 
can  significantly  affect  the  Hessian  matrix  so  that  design  variable  changes  of  similar  magnitude 
will  not  produce  comparable  magnitude  cost  function  changes  [21],  Common  gradient-based 
algorithms  that  employ  an  approximate  Hessian  include  Newton  and  quasi-Newton  nonlinear 
programming  methods  such  as  BFGS,  SQP  methods,  and  nonlinear  least-squares  methods  such 
as  Levenberg-Marquardt  [21].  A  detailed  discussion  of  the  influence  of  design  variable  scaling 
on  optimization  algorithm  performance  can  be  found  in  Gill  et  al.  [21], 

3.  Methodology 

3.1  Optimization  Algorithms 

In  addition  to  our  PSO  algorithm,  three  off-the-shelf  optimization  algorithms  were  applied  to 
all  test  problems  (analytical  and  biomechanical  -  see  below)  for  comparison  purposes.  One  was  a 
global  GA  algorithm  developed  by  Deb  [30-32].  This  basic  GA  implementation  utilizes  one 
mutation  operator  and  one  crossover  operator  along  with  real  encoding  to  handle  continuous 
variables.  The  other  two  algorithms  were  commercial  implementations  of  gradient-based  SQP 
and  BFGS  algorithms  (VisualDOC,  Vanderplaats  R  &  D,  Colorado  Springs,  CO). 

All  four  algorithms  (PSO,  GA,  SQP,  and  BFGS)  were  parallelized  to  accommodate  the 
computational  demands  of  the  biomechanical  test  problem.  For  the  PSO  algorithm, 
parallelization  was  performed  by  distributing  individual  particle  function  evaluations  to  different 
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processors  as  detailed  in  [33].  For  the  GA  algorithm,  individual  chromosome  function 
evaluations  were  parallelized  as  described  in  [5].  Finally,  for  the  SQP  and  BFGS  algorithms, 
finite  difference  gradient  calculations  were  perfonned  on  different  processors  as  outlined  in  [34]. 
A  master-slave  paradigm  using  the  Message  Passing  Interface  (MPI)  [35-36]  was  employed  for 
all  parallel  implementations.  Parallel  optimizations  for  the  biomechanical  test  problem  were  run 
on  a  cluster  of  Linux-based  PCs  in  the  University  of  Florida  High-performance  Computing  and 
Simulation  Research  Laboratory  (1.33  GHz  Athlon  CPUs  with  256MB  memory  on  a  100Mbps 
switched  Fast  Ethernet  network). 

While  the  PSO  algorithm  used  standard  algorithm  parameters  for  all  optimization  runs,  minor 
algorithm  tuning  was  perfonned  on  the  GA,  SQP,  and  BFGS  algorithms  for  the  biomechanical 
test  problem.  The  goal  was  to  give  these  algorithms  the  best  possible  chance  for  success  against 
the  PSO  algorithm.  For  the  GA  algorithm,  preliminary  optimizations  were  performed  using 
population  sizes  ranging  from  40  to  100.  It  was  found  that  for  the  specified  maximum  number  of 
function  evaluations,  a  population  size  of  60  produced  the  best  results.  Consequently,  this 
population  size  was  used  for  all  subsequent  optimization  runs  (analytical  and  biomechanical). 
For  the  SQP  and  BFGS  algorithms,  automatic  tuning  of  the  finite  difference  step  size  (FDSS) 
was  performed  separately  for  each  design  variable.  At  the  start  of  each  gradient-based  run, 
forward  and  central  difference  gradients  were  calculated  for  each  design  variable  beginning  with 
a  relative  FDSS  of  1CT1 .  The  step  size  was  then  incrementally  decreased  by  factors  of  ten  until 
the  absolute  difference  between  forward  and  central  gradient  results  was  a  minimum.  This 
approach  was  taken  since  the  amount  of  noise  in  the  biomechanical  test  problem  prevented  a 
single  stable  gradient  value  from  being  calculated  over  a  wide  range  of  FDSS  values  (see 
Discussion).  The  forward  difference  step  size  automatically  selected  for  each  design  variable  was 
used  for  the  remainder  of  the  run. 

3.2  Analytical  Test  Problems 

The  global  search  capabilities  of  our  PSO  implementation  were  evaluated  using  a  suite  of 
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difficult  analytical  test  problems  previously  published  by  Soest  and  Casius  [5],  In  that  study, 
each  problem  in  the  suite  was  evaluated  using  four  different  optimizers:  SA,  GA*,  SQP*,  and 
DS,  where  a  star  indicates  a  different  version  of  an  algorithm  used  in  our  study.  One  thousand 
optimization  runs  were  performed  with  each  optimizer  starting  from  random  initial  guesses  and 
using  standard  optimization  algorithm  parameters.  Each  run  was  terminated  based  on  a  pre¬ 
defined  number  of  function  evaluations  for  the  particular  problem  being  solved.  We  followed  an 
identical  procedure  with  our  four  algorithms  to  pennit  comparison  between  our  results  and  those 
published  in  [5],  Since  two  of  the  algorithms  used  in  our  study  (GA  and  SQP)  were  of  the  same 
general  category  as  algorithms  used  [5]  (GA*  and  SQP*),  comparisons  could  be  made  between 
different  implementations  of  the  same  general  algorithm.  Failed  PSO  and  GA  runs  were  allowed 
to  use  up  the  full  number  of  function  evaluations,  whereas  failed  SQP  and  BFGS  runs  were  re¬ 
started  from  new  random  initial  guesses  until  the  full  number  of  function  evaluations  was 
completed.  Only  100  rather  than  1000  runs  were  perfonned  with  the  SQP  and  BFGS  algorithms 
due  to  a  database  size  problem  in  the  VisualDOC  software. 

A  detailed  description  of  the  six  analytical  test  problems  can  be  found  in  Soest  and  Casius 
[5].  Since  the  design  variables  for  each  problem  possessed  the  same  absolute  upper  and  lower 
bound  and  appeared  in  the  cost  function  in  a  similar  form,  design  variable  scaling  was  not  an 
issue  in  these  problems.  The  six  analytical  test  problems  are  described  briefly  below. 

Hx :  This  simple  2-dimensional  function  [5]  has  several  local  maxima  and  a  global  maximum 
of  2  at  the  coordinates  (8.6998,  6.7665). 


H  iC*i,x2) 


ys 

1 

+  sin2 

r  jo 

.w  - 

l  8; 

l  8  J 

d  + 1 


(16) 


xt,x2  e  [-100,100] 

where  d  =  y](xx  -8.6998)2  +(x2  -6.7665)2 

Ten  thousand  function  evaluations  were  used  for  this  problem. 
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H , :  This  inverted  version  of  the  F6  function  used  by  Schaffer  et  al.  [37]  has  2  dimensions 


with  several  local  maxima  around  the  global  maximum  of  1.0  at  (0,0). 


/  x  sin  U/x,  +  x,  -  0.5 

H  (x  x  1  =  05 _ _ 11 _ 

(l  +  0.00 1  (x,  +  x2)j 

Xj,x2  €  [-100,100] 


(17) 


This  problem  was  solved  using  20,000  function  evaluations  per  optimization  run. 

//, :  This  test  function  from  Corana  et  al.  [38]  was  used  with  dimensionality  n  =  4,  8,  16,  and 
32.  The  function  contains  a  large  number  of  local  minima  (on  the  order  of  104")  with  a  global 
minimum  of  0  at  lx,  I  <  0.05  . 


H  Y  )  =  J  {(* '  sgn(Z' )  +  z'  )2  ' c  '  d>  if  K'  “  z'  I  <  ^ 

'  d,  •  x? 


/=! 


otherwise 


(18) 


x. 


[-1000,1000] 


where 


2;  = 


+  0.49999 


•sgn(x,)-5,  c  =  0.15, 


5  =  0.2  ,  t  =  0.05  ,  and  t/  = 


1  /=  1,5,9,... 

1000  i  =  2,6,10,... 
10  i=3,7,ll,... 

100  i  =  4,8,12,... 


The  use  of  the  floor  function  in  Eq.  (18)  makes  the  search  space  for  this  problem  the  most 
discrete  of  all  problems  tested.  The  number  of  function  evaluations  used  for  this  problem  was 
50,000  (n  =  4),  100,000  (n  =  8),  200,000  (n  =  16),  and  400,000  (n  =  32). 

For  all  of  the  analytical  test  problems,  an  algorithm  was  considered  to  have  succeeded  if  it 
converged  to  within  10  3  of  the  known  optimum  cost  function  value  within  the  specified  number 
of  function  evaluations  [5]. 


3.3  Biomechanical  Test  Problem 

In  addition  to  these  analytical  test  problems,  a  biomechanical  test  problem  was  used  to 
evaluate  the  scale-independent  nature  of  the  PSO  algorithm.  Though  our  PSO  algorithm  is 
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theoretically  insensitive  to  design  variable  scaling,  numerical  round-off  errors  and 
implementation  details  could  potentially  produce  a  scaling  effect.  Running  the  other  three 
algorithms  on  scaled  and  unsealed  versions  of  this  test  problem  also  permitted  investigation  of 
the  extent  to  which  other  algorithms  are  influenced  by  design  variable  scaling. 

The  biomechanical  test  problem  involved  determination  of  an  ankle  joint  kinematic  model 
that  best  matched  noisy  synthetic  (i.e.,  computer  generated)  movement  data.  Similar  to  [13],  the 
ankle  was  modeled  as  a  three-dimensional  linkage  with  two  non-intersecting  pin  joints  defined 
by  12  subject-specific  parameters  (Fig.  1).  These  parameters  represent  the  positions  and 
orientations  of  the  talocrural  and  subtalar  joint  axes  in  the  tibia,  talus,  and  calcaneous.  Position 
parameters  were  in  units  of  centimeters  and  orientation  parameters  in  units  of  radians,  resulting 
in  parameter  values  of  varying  magnitude.  This  model  was  part  of  a  larger  27  degree-of-freedom 
(DOF)  full-body  kinematic  model  used  to  optimize  other  joints  as  well  [15]. 

Given  this  model  structure,  noisy  synthetic  movement  data  were  generated  from  a  nominal 
model  for  which  the  "true"  model  parameters  were  known.  Joint  parameters  for  the  nominal 
model  along  with  a  nominal  motion  were  derived  from  in  vivo  experimental  movement  data 
using  the  optimization  methodology  described  below.  Next,  three  markers  were  attached  to  the 
tibia  and  calcaneous  segments  in  the  model  at  locations  consistent  with  the  experiment,  and  the 
27  model  DOFs  were  moved  through  their  nominal  motions.  This  process  created  synthetic 
marker  trajectories  consistent  with  the  nominal  model  parameters  and  motion  and  also 
representative  of  the  original  experimental  data.  Finally,  numerical  noise  was  added  to  the 
synthetic  marker  trajectories  to  emulate  skin  and  soft  tissue  movement  artifacts.  For  each  marker 
coordinate,  a  sinusoidal  noise  function  was  used  with  uniformly  distributed  random  period, 
phase,  and  amplitude  (limited  to  a  maximum  of  ±  1  cm).  The  values  of  the  sinusoidal  parameters 
were  based  on  previous  studies  reported  in  the  literature  [39,40]. 

An  unconstrained  optimization  problem  with  bounds  on  the  design  variables  was  formulated 
to  attempt  to  recover  the  known  joint  parameters  from  the  noisy  synthetic  marker  trajectories. 
The  cost  function  was 
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min  f(p)  (19) 

P 

with 

50  6  3  2 

f(p)  =  £min££  (c//*  -  %  (p>  9 ))  (20) 

A-=l  ?  j=\  i= 1 

where  p  is  a  vector  of  12  design  variables  containing  the  joint  parameters,  q  is  a  vector  of  27 
generalized  coordinates  for  the  kinematic  model,  cljk  is  the  /th  coordinate  of  synthetic  marker  j  at 
time  frame  k,  and  cijk(p,q )  is  the  corresponding  marker  coordinate  from  the  kinematic  model.  At 
each  time  frame,  cijk{p,q )  was  computed  from  the  current  model  parameters p  and  an  optimized 
model  configuration  q.  A  separate  Levenberg-Marquardt  nonlinear  least-squares  optimization 
was  performed  for  each  time  frame  in  Eq.  (20)  to  detennine  this  optimal  configuration.  A 
relative  convergence  tolerance  of  10  3  was  chosen  to  achieve  good  accuracy  with  minimum 
computational  cost.  A  nested  optimization  formulation  (i.e.,  minimization  occurs  in  Eqs.  (19) 
and  (20))  was  used  to  decrease  the  dimensionality  of  the  design  space  in  Eq.  (19).  Equation  (20) 
was  coded  in  Matlab  and  exported  as  stand-alone  C  code  using  the  Matlab  Compiler  (The 
Mathworks,  Natick,  MA).  The  executable  read  in  a  file  containing  the  12  design  variables  and 
output  a  file  containing  the  resulting  cost  function  value.  This  approach  facilitated  the  use  of 
different  optimizers  to  solve  Eq.  (19). 

To  investigate  the  influence  of  design  variable  scaling  on  optimization  algorithm 
performance,  two  versions  of  Eq.  (20)  were  generated.  The  first  used  the  original  units  of 
centimeters  and  radians  for  the  position  and  orientation  design  variables  respectively.  Bounds  on 
the  design  variables  were  chosen  to  enclose  a  physically  realistic  region  around  the  solution  point 
in  design  space.  Each  position  design  variable  was  constrained  to  remain  within  a  cube  centered 
at  the  midpoint  of  the  medial  and  lateral  malleoli,  where  the  length  of  each  side  was  equal  to  the 
distance  between  the  malleoli  (i.e.,  11.32  cm).  Each  orientation  design  variable  was  constrained 
to  remain  within  a  circular  cone  defined  by  varying  its  “true”  value  by  ±  30°.  The  second  version 
nonnalized  all  12  design  variables  to  be  within  [-1,1]  using 
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g norm  _  2 X  Xy g  X LB  (21) 

*UB  ~~  *LB 

where  UB  and  LB  denote  the  upper  and  lower  bounds,  respectively,  on  the  design  variable  vector 
[41]. 

Two  approaches  were  used  to  compare  PSO  scale  sensitivity  to  that  of  the  other  three 
algorithms.  For  the  first  approach,  a  fixed  number  of  scaled  and  unsealed  runs  (10)  was 
performed  with  each  optimization  algorithm  starting  from  different  random  initial  seeds,  and  the 
sensitivity  of  the  final  cost  function  value  to  algorithm  choice  and  design  variable  scaling  was 
evaluated.  The  stopping  condition  for  PSO  and  GA  runs  was  10,000  function  evaluations,  while 
SQP  and  BFGS  runs  were  terminated  when  a  relative  convergence  tolerance  of  10  or  absolute 
convergence  tolerance  of  10  6  was  met.  For  the  second  approach,  a  fixed  number  of  function 
evaluations  (10,000)  were  performed  with  each  algorithm  to  investigate  unsealed  versus  scaled 
convergence  history.  A  single  random  initial  guess  was  used  for  the  PSO  and  GA  algorithms, 
and  each  algorithm  was  terminated  once  it  reached  10,000  function  evaluations.  Since  individual 
SQP  and  BFGS  runs  require  much  fewer  than  10,000  function  evaluations,  repeated  runs  were 
performed  with  different  random  initial  guesses  until  the  total  number  of  function  evaluations 
exceeded  10,000  at  the  termination  of  a  run.  This  approach  essentially  uses  SQP  and  BFGS  as 
global  optimizers,  where  the  separate  runs  are  like  individual  particles  that  cannot  communicate 
with  each  another  but  have  access  to  local  gradient  information.  Finite  difference  step  size  tuning 
at  the  start  of  each  run  was  included  in  the  computation  of  number  of  function  evaluations.  Once 
the  total  number  of  runs  required  to  reach  10,000  function  evaluations  was  known,  the  lowest 
cost  function  value  from  all  runs  at  each  iteration  was  used  to  represent  the  cost  over  a  range  of 
function  evaluations  equal  to  the  number  of  runs. 

4.  Results 

For  the  analytical  test  problems,  our  PSO  algorithm  was  more  robust  than  our  GA,  SQP,  and 
BFGS  algorithms  (Table  2).  PSO  converged  to  the  correct  global  solution  nearly  100%  of  the 
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time  on  four  of  the  six  test  problems  ( Hx  and  H3  with  n  =  4,  8,  and  16).  It  converged  67%  of  the 
time  for  problem  H2  and  only  1.5%  of  the  time  for  problem  //  ,  with  n  =  32.  In  contrast,  none  of 
the  other  algorithms  converged  more  than  32%  of  the  time  on  any  of  the  analytical  test  problems. 
Though  our  GA  algorithm  typically  exhibited  faster  initial  convergence  than  did  our  PSO 
algorithm  (Fig.  2),  it  leveled  off  and  rarely  reached  the  correct  final  point  in  design  space  within 
the  specified  number  of  function  evaluations. 

When  PSO  results  were  compared  with  previously  published  results  for  the  same  analytical 
test  problems  [5],  PSO  success  rates  were  better  than  those  of  SA  but  worse  than  those  of  GA* 
(Table  Al).  SA  was  successful  100%  of  the  time  only  on  problem  Hx,  while  GA*  was 
successful  nearly  100%  of  the  time  on  all  six  problems.  Thus,  for  the  global  algorithms,  GA* 
was  the  most  robust  overall,  followed  by  PSO  and  then  SA,  with  GA  exhibiting  the  worst 
robustness.  PSO  converged  more  slowly  than  did  GA*  on  all  problems  with  available 
convergence  plots  (four  of  the  six  problems)  and  also  more  slowly  than  did  SA  on  the  one 
problem  for  which  SA  was  successful  (Fig.  Al). 

For  the  biomechanical  test  problem,  only  the  PSO  algorithm  was  insensitive  to  design 
variable  scaling,  with  the  GA  algorithm  being  only  mildly  sensitive.  In  ten  out  of  ten  trials, 
unsealed  and  scaled  PSO  runs  converged  to  the  same  point  in  design  space  (Fig.  3a),  while 
unsealed  and  scaled  GA  runs  converged  to  nearly  the  same  point  (Fig.  3b).  PSO  results  were  the 
most  consistent  from  trial  to  trial,  converging  to  a  final  cost  function  value  between  69  and  70. 
(Table  3).  GA  results  were  the  next  most  consistent  with  final  cost  function  values  ranging  from 
71  to  84.  Typical  unsealed  and  scaled  PSO  and  GA  runs  produced  root-mean-square  (RMS) 
marker  distance,  position  parameter,  and  orientation  parameter  errors  of  comparable  magnitude, 
with  PSO  errors  generally  being  slightly  smaller. 

In  contrast,  the  SQP  and  BFGS  algorithms  were  highly  sensitive  to  design  variable  scaling  in 
the  biomechanical  test  problem.  For  the  ten  trials,  unsealed  and  scaled  SQP  or  BFGS  runs  rarely 
converged  to  similar  points  in  design  space  (note  y  axis  scale  in  Fig.  3)  and  produced  large 
differences  in  final  cost  function  value  from  one  trial  to  the  next  (Fig.  3c  and  d).  Scaling 
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improved  the  final  result  in  seven  out  of  ten  SQP  trials  and  in  five  of  ten  BFGS  trials.  The  best 
unsealed  and  scaled  SQP  final  cost  function  values  were  255  and  121,  respectively,  while  those 
of  BFGS  were  355  and  102  (Table  3).  Thus,  scaling  yielded  the  best  result  found  with  both 
algorithms.  The  best  SQP  and  BFGS  trials  generally  produced  larger  RMS  marker  distance 
errors  (up  to  two  times  worse),  orientation  parameter  errors  (up  to  five  times  worse),  and  position 
parameter  errors  (up  to  six  times  worse)  than  those  found  by  PSO  or  GA. 

When  detailed  convergence  histories  were  plotted  over  10,000  function  evaluations  for  the 
biomechanical  test  problem  (Fig.  4),  unsealed  and  scaled  histories  for  PSO  were 
indistinguishable,  while  those  of  GA  were  similar  and  those  of  SQP  or  BFGS  notably  different. 
For  PSO,  only  minute  differences  in  the  design  variables  on  the  order  of  1CT5  were  observed, 
resulting  from  numerical  round  off  errors  caused  by  limitations  in  machine  precision.  To  reach 
10,000  function  evaluations,  17  unsealed  and  26  scaled  SQP  runs  were  required  compared  to  56 
unsealed  and  44  scaled  runs  for  BFGS.  The  length  and  value  of  the  initial  flat  region  in  the  SQP 
and  BFGS  convergence  histories  was  related  to  the  number  of  FDSS  tuning  evaluations 
performed.  More  runs  meant  more  tuning  evaluations  as  well  as  an  increased  likelihood  of 
finding  a  lower  initial  cost  function  value. 

5,  Discussion 

This  paper  evaluated  a  recent  variation  of  the  PSO  algorithm  with  dynamic  inertia  and 
velocity  updating  as  a  possible  addition  to  the  arsenal  of  methods  that  can  be  applied  to  difficult 
biomechanical  optimization  problems.  For  all  problems  investigated,  our  PSO  algorithm  with 
standard  algorithm  parameters  performed  better  than  did  three  off-the-self  optimizers  -  GA, 
SQP,  and  BFGS.  For  the  analytical  test  problems,  PSO  robustness  was  found  to  be  better  than 
that  of  two  other  global  algorithms  but  worse  than  that  of  a  third.  For  the  biomechanical  test 
problem  with  added  numerical  noise,  PSO  was  found  to  be  insensitive  to  design  variable  scaling 
while  GA  was  only  mildly  sensitive  and  SQP  and  BFGS  highly  sensitive.  Overall,  the  results 
suggest  that  our  PSO  algorithm  is  worth  consideration  for  difficult  biomechanical  optimization 


16 


Revision  2 


Schutte  et  al. 


problems,  especially  those  for  which  design  variable  scaling  may  be  an  issue.  Making  our  PSO 
algorithm  freely  available  provides  a  new  off-the-shelf  option  for  such  problems. 

Though  our  biomechanical  optimization  involved  a  system  identification  problem,  PSO  may 
be  equally  applicable  to  problems  involving  forward  dynamic,  inverse  dynamic,  inverse  static,  or 
image  matching  analyses.  Other  global  methods  such  as  SA  and  GA  have  already  been  applied 
successfully  to  such  problems  [4-5,19],  and  there  is  no  reason  to  believe  that  PSO  would  not 
perform  equally  well.  As  with  any  global  optimizer,  PSO  utilization  would  be  limited  by  the 
computational  cost  of  function  evaluations  given  the  large  number  required  for  a  global  search. 

Our  particle  swarm  implementation  may  also  be  applicable  to  some  large-scale 
biomechanical  optimization  problems.  Outside  the  biomechanics  arena  [28-29,42-51],  PSO  has 
been  used  to  solve  problems  on  the  order  of  120  design  variables  [49-51].  In  the  present  study, 
our  PSO  algorithm  was  unsuccessful  on  the  largest  test  problem,  H3  with  n  =  32  design 
variables.  However,  in  a  recent  study,  our  PSO  algorithm  successfully  solved  the  Griewank 
global  test  problem  with  128  design  variables  using  population  sizes  ranging  from  16  to  128 
[33].  When  the  Corana  test  problem  (  //, )  was  attempted  with  128  DVs,  the  algorithm  exhibited 
worse  convergence.  Since  the  Griewank  problem  possesses  a  bumpy  but  continuous  search  space 
and  the  Corana  problem  a  highly  discrete  search  space,  our  PSO  algorithm  may  work  best  on 
global  problems  with  a  continuous  search  space.  It  is  not  known  how  our  PSO  algorithm  would 
perform  on  biomechanical  problems  with  several  hundred  DVs,  such  as  the  forward  dynamic 
optimizations  of  jumping  and  walking  performed  with  parallel  SQP  in  [1-3], 

One  advantage  of  global  algorithms  such  as  PSO,  GA,  and  SA  is  that  they  often  do  not 
require  significant  algorithm  parameter  tuning  to  perform  well  on  difficult  problems.  The  GA 
used  in  [5]  (which  is  not  freely  available)  required  no  tuning  to  perform  well  on  all  of  these 
particular  analytical  test  problems.  The  SA  algorithm  in  [5]  required  tuning  of  two  parameters  to 
improve  algorithm  robustness  significantly  on  those  problems.  Our  PSO  algorithm  (which  is 
freely  available  with  this  article)  required  tuning  of  one  parameter  ( wd ,  which  was  increased 
from  1.0  to  1.5)  to  produce  100%  success  on  the  two  problems  where  it  had  significant  failures. 
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For  the  biomechanical  test  problem,  our  PSO  algorithm  required  no  tuning,  and  only  the 
population  size  of  our  GA  algorithm  required  tuning  to  improve  convergence  speed.  Neither 
algorithm  was  sensitive  to  the  two  sources  of  noise  present  in  the  problem  -  noise  added  to  the 
synthetic  marker  trajectories,  and  noise  due  to  a  somewhat  loose  convergence  tolerance  in  the 
Levenberg-Marquardt  sub-optimizations.  Thus,  for  many  global  algorithm  implementations, 
poor  performance  on  a  particular  problem  can  be  rectified  by  minor  tuning  of  a  small  number  of 
algorithm  parameters. 

In  contrast,  gradient-based  algorithms  such  as  SQP  and  BFGS  can  require  a  significant 
amount  of  tuning  even  to  begin  to  approach  global  optimizer  results  on  some  problems.  For  the 
biomechanical  test  problem,  our  SQP  and  BFGS  algorithms  were  highly  tuned  by  scaling  the 
design  variables  and  determining  the  optimal  FDSS  for  each  design  variable  separately.  FDSS 
tuning  was  especially  critical  due  to  the  two  sources  of  noise  noted  above.  When  forward  and 
central  difference  gradient  results  were  compared  for  one  of  the  design  variables  using  two 
different  Levenberg-Marquardt  relative  convergence  tolerances  ( 1 0  and  1 0  6 ),  a  "sweet  spot" 
was  found  near  a  step  size  of  10  2  (Fig.  5).  Outside  of  that  "sweet  spot,"  which  was 
automatically  identified  and  used  in  generating  our  SQP  and  BFGS  results,  forward  and  central 
difference  gradient  results  diverged  quickly  when  the  looser  tolerance  was  used.  Since  most 
users  of  gradient-based  optimization  algorithms  do  not  scale  the  design  variables  or  tune  the 
FDSS  for  each  design  variable  separately,  and  many  do  not  perform  multiple  runs,  our  SQP  and 
BFGS  results  for  the  biomechanical  test  problem  represent  best-case  rather  than  typical  results. 
For  this  particular  problem,  an  off-the-shelf  global  algorithm  such  as  PSO  or  GA  is  preferable 
due  to  the  significant  reduction  in  effort  required  to  obtain  repeatable  and  reliable  solutions. 

Another  advantage  of  PSO  and  GA  algorithms  is  the  ease  with  which  they  can  be  parallelized 
[5,33]  and  their  resulting  high  parallel  efficiency.  For  our  PSO  algorithm,  Schutte  et  al.  [33] 
recently  reported  near  ideal  parallel  efficiency  for  up  to  32  processors.  Soest  and  Casius  [5] 
reported  near  ideal  parallel  efficiency  for  their  GA  algorithm  with  up  to  40  processors.  Though 
SA  has  historically  been  considered  more  difficult  to  parallelize  [52],  Higginson  et  al.  [53] 
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recently  developed  a  new  parallel  SA  implementation  and  demonstrated  near  ideal  parallel 
efficiency  for  up  to  32  processors.  In  contrast,  Koh  et  al.  [34]  reported  poor  SQP  parallel 
efficiency  for  up  to  12  processors  due  to  the  sequential  nature  of  the  line  search  portion  of  the 
algorithm. 

The  caveat  for  these  parallel  efficiency  results  is  that  the  time  required  per  function 
evaluation  was  approximately  constant  and  the  computational  nodes  were  homogeneous.  As 
shown  in  [33],  when  function  evaluations  take  different  amounts  of  time,  parallel  efficiency  of 
our  PSO  algorithm  (and  any  other  synchronous  parallel  algorithm,  including  GA,  SA,  SQP,  and 
BFGS)  will  degrade  with  increasing  number  of  processors.  Synchronization  between  individuals 
in  the  population  or  between  individual  gradient  calculations  requires  slave  computational  nodes 
that  have  completed  their  function  evaluations  to  sit  idle  until  all  nodes  have  returned  their 
results  to  the  master  node.  Consequently,  the  slowest  computational  node  (whether  loaded  by 
other  users,  performing  the  slowest  function  evaluation,  or  possessing  the  slowest  processor  in  a 
heterogeneous  environment)  will  dictate  the  overall  time  for  each  parallel  iteration.  An 
asynchronous  PSO  implementation  with  load  balancing,  where  the  global  best-found  position  is 
updated  continuously  as  each  particle  completes  a  function  evaluation,  could  address  this 
limitation.  However,  the  extent  to  which  convergence  characteristics  and  scale  independence 
would  be  affected  is  not  yet  known. 

To  put  the  results  of  our  study  into  proper  perspective,  one  must  remember  that  optimization 
algorithm  robustness  can  be  influenced  heavily  by  algorithm  implementation  details,  and  no 
single  optimization  algorithm  will  work  for  all  problems.  For  two  of  the  analytical  test  problems 
( II 2  and  //,  with  n  =  4),  other  studies  have  reported  PSO  results  using  formulations  that  did  not 
include  dynamic  inertia  and  velocity  updating.  Comparisons  are  difficult  given  differences  in  the 
maximum  number  of  function  evaluations  and  number  of  particles,  but  in  general,  algorithm 
modifications  were  (not  surprisingly)  found  to  influence  algorithm  convergence  characteristics 
[54-56].  For  our  GA  and  SQP  algorithms,  results  for  the  analytical  test  problems  were  very 
different  from  those  obtained  in  [5]  using  different  GA  and  SQP  implementations.  With  seven 
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mutation  and  four  crossover  operators,  the  GA  algorithm  used  in  [5]  was  obviously  much  more 
complex  than  the  one  used  here.  In  contrast,  both  SQP  algorithms  were  highly-developed 
commercial  implementations.  In  contrast,  poor  performance  by  a  gradient-based  algorithm  can 
be  difficult  to  correct  even  with  design  variable  scaling  and  careful  tuning  of  the  FDSS.  These 
findings  indicate  that  specific  algorithm  implementations,  rather  than  general  classes  of 
algorithms,  must  be  evaluated  to  reach  any  conclusions  about  algorithm  robustness  and 
performance  on  a  particular  problem. 

6.  Conclusion 

In  summary,  the  PSO  algorithm  with  dynamic  inertia  and  velocity  updating  provides  another 
option  for  difficult  biomechanical  optimization  problems  with  the  added  benefit  of  being  scale 
independent.  There  are  few  algorithm-specific  parameters  to  adjust,  and  standard  recommended 
settings  work  well  for  most  problems  [60,85].  The  algorithm’s  main  drawback  is  the  high  cost  in 
tenns  of  function  evaluations  because  of  slow  convergence  in  the  final  stages  of  the 
optimization,  a  common  trait  among  global  search  algorithms.  In  biomechanical  optimization 
problems,  noise,  multiple  local  minima,  and  design  variables  of  different  scale  can  limit  the 
reliability  of  gradient-based  algorithms.  The  PSO  algorithm  presented  here  provides  a  simple-to- 
use  off-the-shelf  alternative  for  consideration  in  such  cases.  The  C  source  code  for  our  PSO 
algorithm  is  freely  available  at  www.mae.ufl.edu/~fregly/download/pso.zip. 
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Captions 


Figure  Captions 

Figure  1:  Experimental  shank  and  foot  surface  marker  configuration  (left)  for  developing  a 
subject-specific  kinematic  ankle  model  defined  by  12  parameters  px  through  pl2  (right).  Each 
parameter  defines  the  position  or  orientation  of  a  joint  axis  in  one  of  the  body  segments. 

Figure  2:  Comparison  of  PSO  convergence  history  with  that  of  GA,  SQP,  and  BFGS 
optimization  algorithms  for  the  analytical  test  problems.  Error  was  computed  using  the  known 
cost  at  the  global  optimum  and  represents  the  average  of  1000  (PSO  and  GA)  or  100  (multi-start 
SQP  and  BFGS)  runs  with  each  algorithm,  (a)  Problem  Hx .  (b)  Problem  H1 .  (c)  Problem  //3 
with  n  =  4.  (d)  Problem  //,  with  n  =  32. 

Figure  3:  Final  cost  function  values  for  ten  unsealed  (dark  bars)  and  scaled  (gray  bars)  parallel 
PSO,  GA,  SQP,  and  BFGS  runs  for  the  biomechanical  test  problem.  Each  pair  of  unsealed  and 
scaled  runs  was  started  from  the  same  initial  point(s)  in  design  space,  and  each  run  was 
tenninated  when  the  specified  stopping  criteria  was  met  (see  text). 

Figure  4:  Convergence  history  for  unsealed  (dark  lines)  and  scaled  (gray  lines)  parallel  PSO,  GA, 
SQP,  and  BFGS  runs  for  the  biomechanical  test  problem.  Each  algorithm  was  run  terminated 
after  10,000  function  evaluations.  Only  one  unsealed  and  scaled  PSO  and  GA  run  were  required 
to  reach  10,000  function  evaluations,  while  repeated  SQP  and  BFGS  runs  were  required  to  reach 
that  number.  Separate  SQP  and  BFGS  runs  were  treated  like  individual  particles  in  a  single  PSO 
run  for  calculating  convergence  history  (see  text). 

Figure  5:  Sensitivity  of  SQP  and  BFGS  gradient  calculations  to  selected  finite  difference  stepsize 
for  one  design  variable.  Forward  and  central  differencing  were  evaluated  using  relative 
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convergence  tolerances  of  10  3  and  10  6  for  the  nonlinear  least  squares  sub-optimizations 
performed  during  cost  function  evaluation  (see  Eq.  (20)). 

Figure  Al:  Comparison  of  SA,  GA,  SQP,  and  DS  convergence  history  for  the  analytical  test 
problems  as  reported  previously  by  Soest  and  Casius  [5].  Error  was  computed  using  the  known 
cost  at  the  global  optimum  and  represents  the  average  of  1000  runs  with  each  algorithm, 
(a)  Problem  Hl .  The  SA  results  have  been  updated  using  corrected  data  provided  by  Soest  and 
Casius,  since  the  results  in  [5]  accidentally  used  a  temperature  reduction  rate  of  0.5  rather  than 
the  standard  value  of  0.85  as  reported,  (b)  Problem  H1 .  (c)  Problem  //,  with  n  =  4.  (d)  Problem 
//,  with  n  =  32. 

Table  Captions 

Table  1:  Standard  PSO  algorithm  parameters  used  in  the  present  study. 

Table  2:  Fraction  of  successful  PSO,  GA,  SQP,  and  BFGS  runs  for  the  analytical  test  problems. 
Successful  runs  were  identified  by  a  final  cost  function  value  within  1 0  3  of  the  known  optimum 
value,  consistent  with  [17]. 

Table  3:  Final  cost  function  values  and  associated  marker  distance  and  joint  parameter  root- 
mean-square  (RMS)  errors  after  10,000  function  evaluations  performed  by  multiple  unsealed  and 
scaled  PSO,  GA,  SQP,  and  BFGS  runs.  See  Fig.  4  for  corresponding  convergence  histories. 

Table  Al:  Fraction  of  successful  SA,  GA,  SQP,  and  DS  runs  for  the  analytical  test  problems  as 
reported  previously  by  Soest  and  Casius  [5].  The  GA  and  SQP  algorithm  used  in  that  study  were 
different  from  the  ones  used  in  our  study.  Successful  runs  were  identified  by  a  final  cost  function 
value  within  10  3  of  the  known  optimum  value. 
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Parameter 

Description 

Value 

P 

Population  size  (number  of  particles) 

20 

ci 

Cognitive  trust  parameter 

2.0 

c2 

Social  trust  parameter 

2.0 

w0 

Initial  inertia 

1 

wd 

Inertia  reduction  parameter 

0.01 

K 

Bound  on  velocity  fraction 

0.5 

Vd 

Velocity  reduction  parameter 

0.01 

d 

Dynamic  inertia/velocity  reduction  delay  (function  evaluations) 

200 
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Algorithm 

Hi 

h2 

H3 

(n  =  4) 

("  =  8) 

in~  16) 

CN 

CO 

II 

c: 

PSO 

0.972 

0.688 

1.000 

1.000 

1.000 

0.015 

GA 

0.000 

0.034 

0.000 

0.000 

0.000 

0.002 

SQP 

0.09 

0.11 

0.00 

0.00 

0.00 

0.00 

BFGS 

0.00 

0.32 

0.00 

0.00 

0.00 

0.00 
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RMS  Error 


Optimizer 

Formulation 

Cost 

Function 

Marker 

Distances  (mm) 

Orientation 
Parameters  (deg) 

Position 

Parameters  (mm) 

PSO 

Unsealed 

69.5 

5.44 

2.63 

4.47 

Scaled 

69.5 

5.44 

2.63 

4.47 

Unsealed 

77.9 

5.78 

2.65 

6.97 

GA 

Scaled 

74.0 

5.64 

3.76 

4.01 

Unsealed 

255 

10.4 

3.76 

14.3 

SQP 

Scaled 

121 

7.21 

3.02 

9.43 

Unsealed 

355 

12.3 

21.4 

27.5 

BFGS 

Scaled 

102 

6.61 

18.4 

8.52 
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Algorithm 

Hi 

h2 

H3 

(n  =  4) 

("  =  8) 

in~  16) 

CN 

CO 

II 

c: 

SA 

1.000 

0.027 

0.000 

0.001 

0.000 

0.000 

GA 

0.990 

0.999 

1.000 

1.000 

1.000 

1.000 

SQP 

0.279 

0.810 

0.385 

0.000 

0.000 

0.000 

DS 

1.000 

0.636 

0.000 

0.000 

0.000 

0.000 

